DOCLIB: A Document Processing Research Tool
Abstract
Valuable intellectual capital in document processing is often lost when staff transitions or project restructuring occur before technology transfer. Furthermore, hardware and software integrity, dependencies, and compatibility are critical factors that often impede technology migration. While many open source tools attempt to mitigate these issues, they do not always address the specific design needs and tailored processes that Government organizations must adhere to. This paper addresses the need for a common document processing research vehicle through which institutions can develop and share research-related software and applications across academic, business, and Government domains.
Similar resources
DOCLIB: a software library for document processing
Most researchers would agree that research in the field of document processing can benefit tremendously from a common software library through which institutions are able to develop and share research-related software and applications across academic, business, and government domains. However, despite several attempts in the past, the research community still lacks a widely-accepted standard so...
A new text mining method for extracting user context information to improve the ranking of search engine results
Today, the importance of text processing and its uses is well known among researchers and students. The amount of textual and documentary material increases day by day, so we need useful ways to store it and retrieve information from it. For example, search engines such as Google, Yahoo, and Bing need to read a great many web documents and retrieve the most similar ones to the user ...
Document Image Dewarping Based on Text Line Detection and Surface Modeling (RESEARCH NOTE)
Document images produced by a scanner or digital camera usually suffer from geometric and photometric distortions. Both deteriorate the performance of OCR systems. In this paper, we present a novel method to compensate for undesirable geometric distortions, aiming to improve OCR results. Our methodology is based on finding text lines by dynamic local connectivity map and then applying a l...
Electronic Publishing in the Online Journal "Forum: Qualitative Social Research" (FQS)
In this work we present the electronic publishing process of the Online Journal “Forum: Qualitative Social Research” (FQS). We introduce technologies and tools used to optimize the publishing process, to provide sustainability of FQS publications and to extend the communicative possibilities, but also to net the FQS with other social research Internet resources and to make it accessible to othe...
Plagiarism checker for Persian (PCP) texts using hash-based tree representative fingerprinting
With due respect to authors' rights, plagiarism detection is one of the critical problems in the field of text mining that many researchers are interested in. This issue is considered a serious one in higher academic institutions. There exist language-free tools which do not yield any reliable results since the special features of every language are ignored in them. Considering the paucit...
Publication date: 2005